12 research outputs found
Confident Object Detection via Conformal Prediction and Conformal Risk Control: an Application to Railway Signaling
Deploying deep learning models in real-world certified systems requires the
ability to provide confidence estimates that accurately reflect their
uncertainty. In this paper, we demonstrate the use of the conformal prediction
framework to construct reliable and trustworthy predictors for detecting
railway signals. Our approach builds on a novel dataset of images
taken from the perspective of a train operator, together with state-of-the-art
object detectors. We test several conformal approaches and introduce a new method
based on conformal risk control. Our findings demonstrate the potential of the
conformal prediction framework to evaluate model performance and provide
practical guidance for achieving formally guaranteed uncertainty bounds.
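A minimal sketch of the conformal risk control recipe this abstract alludes to, on synthetic data: given a per-image loss that shrinks as predicted boxes are expanded by a margin lambda, pick the smallest lambda whose adjusted calibration risk stays below a target level alpha. The additive pixel-margin parametrization, the loss, and the function names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def conformal_risk_control(cal_losses, lambdas, alpha, B=1.0):
    """Smallest margin lambda whose adjusted empirical risk is <= alpha.

    cal_losses[i, j]: loss of calibration image i when predicted boxes are
    expanded by margin lambdas[j]; assumed bounded in [0, B] and
    nonincreasing in lambda. lambdas must be sorted ascending.
    """
    n = cal_losses.shape[0]
    risks = cal_losses.mean(axis=0)
    # CRC guarantee: E[loss] <= alpha whenever (n/(n+1)) R_hat + B/(n+1) <= alpha
    ok = (n / (n + 1)) * risks + B / (n + 1) <= alpha
    if not ok.any():
        raise ValueError("alpha is too strict for this calibration set")
    return lambdas[np.argmax(ok)]  # first lambda that satisfies the bound

# synthetic calibration losses standing in for per-image box miscoverage
rng = np.random.default_rng(0)
lambdas = np.linspace(0.0, 50.0, 101)          # candidate pixel margins
base = rng.uniform(1.0, 60.0, size=(500, 1))
cal_losses = np.clip(1.0 - lambdas[None, :] / base, 0.0, 1.0)
print(conformal_risk_control(cal_losses, lambdas, alpha=0.1))
```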
Gradient strikes back: How filtering out high frequencies improves explanations
Recent years have witnessed an explosion in the development of novel
prediction-based attribution methods, which have slowly been supplanting older
gradient-based methods to explain the decisions of deep neural networks.
However, it is still not clear why prediction-based methods outperform
gradient-based ones. Here, we start with an empirical observation: these two
approaches yield attribution maps with very different power spectra, with
gradient-based methods revealing more high-frequency content than
prediction-based methods. This observation raises multiple questions: What is
the source of this high-frequency information, and does it truly reflect
decisions made by the system? Lastly, why would the absence of high-frequency
information in prediction-based methods yield better explainability scores
along multiple metrics? We analyze the gradient of three representative visual
classification models and observe that it contains noisy information emanating
from high frequencies. Furthermore, our analysis reveals that the operations
used in Convolutional Neural Networks (CNNs) for downsampling appear to be a
significant source of this high-frequency content -- suggesting aliasing as a
possible underlying basis. We then apply an optimal low-pass filter to
attribution maps and demonstrate that it improves gradient-based attribution
methods. We show that (i) removing high-frequency noise yields significant
improvements in the explainability scores obtained with gradient-based methods
across multiple models -- leading to (ii) a novel ranking of state-of-the-art
methods with gradient-based methods at the top. We believe that our results
will spur renewed interest in simpler and computationally more efficient
gradient-based methods for explainability.
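As a concrete illustration of the paper's central manipulation, here is a hedged sketch that removes high-frequency content from a 2-D attribution map with an ideal low-pass filter in the Fourier domain. The hard cutoff, its value, and the random stand-in map are assumptions; the paper derives an optimal filter rather than hand-picking one.

```python
import numpy as np

def low_pass(attribution, keep=0.1):
    """Zero out spatial frequencies above a fraction `keep` of Nyquist."""
    h, w = attribution.shape
    fy = np.fft.fftfreq(h)[:, None]        # cycles per pixel, Nyquist = 0.5
    fx = np.fft.fftfreq(w)[None, :]
    mask = np.sqrt(fx**2 + fy**2) <= keep * 0.5
    return np.real(np.fft.ifft2(np.fft.fft2(attribution) * mask))

# stand-in for an input-gradient saliency map of a CNN classifier
grad_map = np.random.default_rng(0).normal(size=(224, 224))
smooth_map = low_pass(grad_map, keep=0.1)  # same map, high frequencies removed
```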
Towards Understanding the Mechanism of Contrastive Learning via Similarity Structure: A Theoretical Analysis
Contrastive learning is an efficient approach to self-supervised
representation learning. Although recent studies have made progress in the
theoretical understanding of contrastive learning, the investigation of how to
characterize the clusters of the learned representations is still limited. In
this paper, we aim to elucidate the characterization from theoretical
perspectives. To this end, we consider a kernel-based contrastive learning
framework termed Kernel Contrastive Learning (KCL), where kernel functions play
an important role when applying our theoretical results to other frameworks. We
introduce a formulation of the similarity structure of learned representations
by utilizing a statistical dependency viewpoint. We investigate the theoretical
properties of the kernel-based contrastive loss via this formulation. We first
prove that the formulation characterizes the structure of representations
learned with the kernel-based contrastive learning framework. We show a new
upper bound on the classification error of a downstream task, implying
that our theory is consistent with the empirical success of contrastive
learning. We also establish a generalization error bound for KCL. Finally, we
show a guarantee for the generalization ability of KCL to the downstream
classification task via a surrogate bound.
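To make "kernel-based contrastive loss" concrete, here is a hedged sketch of a generic objective in that family: align embeddings of two augmented views under a kernel while pushing apart embeddings of different examples. The Gaussian kernel and this particular alignment-minus-spread decomposition are assumptions; the paper's exact KCL objective and theory are richer.

```python
import numpy as np

def gaussian_kernel(a, b, gamma=1.0):
    """Pairwise k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_contrastive_loss(z1, z2, gamma=1.0):
    """z1[i], z2[i]: embeddings of two augmented views of example i."""
    k = gaussian_kernel(z1, z2, gamma)
    pos = np.diag(k).mean()                        # positive-pair similarity
    n = k.shape[0]
    neg = (k.sum() - np.trace(k)) / (n * (n - 1))  # cross-example similarity
    return neg - pos                               # minimize: spread - alignment

z1 = np.random.default_rng(0).normal(size=(128, 16))
z2 = z1 + 0.1 * np.random.default_rng(1).normal(size=(128, 16))
print(kernel_contrastive_loss(z1, z2))
```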
Learning Domain Invariant Representations by Joint Wasserstein Distance Minimization
Domain shifts in the training data are common in practical applications of
machine learning; they occur, for instance, when the data comes from
different sources. Ideally, an ML model should work well independently of these
shifts, for example, by learning a domain-invariant representation. Moreover,
privacy concerns regarding the source also require a domain-invariant
representation. In this work, we provide theoretical results that link
domain-invariant representations -- measured by the Wasserstein distance on the joint
distributions -- to a practical semi-supervised learning objective based on a
cross-entropy classifier and a novel domain critic. Quantitative experiments
demonstrate that the proposed approach is indeed able to learn such
an invariant representation (between two domains) in practice, and that this
representation also supports models with higher predictive accuracy on both
domains, comparing favorably to existing techniques.
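A hedged PyTorch sketch of the kind of objective this abstract describes: a cross-entropy classifier on the labeled source domain plus a domain critic whose value estimates a Wasserstein-1 distance that the encoder then minimizes. The architectures, weight clipping as the Lipschitz constraint, and the use of feature marginals rather than the paper's joint (feature, label) distributions are simplifying assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
classifier = nn.Linear(16, 10)
critic = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

opt_main = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-4)
opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

def train_step(x_src, y_src, x_tgt, lam=0.1, n_critic=5):
    # 1) critic ascends its Wasserstein-1 estimate between the two domains
    for _ in range(n_critic):
        z_s = encoder(x_src).detach()
        z_t = encoder(x_tgt).detach()
        w_est = critic(z_s).mean() - critic(z_t).mean()
        opt_critic.zero_grad()
        (-w_est).backward()
        opt_critic.step()
        for p in critic.parameters():      # crude 1-Lipschitz constraint
            p.data.clamp_(-0.01, 0.01)
    # 2) encoder + classifier minimize source CE + estimated domain distance
    z_s, z_t = encoder(x_src), encoder(x_tgt)
    loss = ce(classifier(z_s), y_src) + lam * (critic(z_s).mean()
                                               - critic(z_t).mean())
    opt_main.zero_grad()
    loss.backward()
    opt_main.step()
    return loss.item()

x_s, y_s = torch.randn(32, 64), torch.randint(0, 10, (32,))
x_t = torch.randn(32, 64)
print(train_step(x_s, y_s, x_t))
```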
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation
In recent years, concept-based approaches have emerged as some of the most
promising explainability methods to help us interpret the decisions of
Artificial Neural Networks (ANNs). These methods seek to discover intelligible
visual 'concepts' buried within the complex patterns of ANN activations in two
key steps: (1) concept extraction followed by (2) importance estimation. While
these two steps are shared across methods, they all differ in their specific
implementations. Here, we introduce a unifying theoretical framework that
comprehensively defines and clarifies these two steps. This framework offers
several advantages as it allows us: (i) to propose new evaluation metrics for
comparing different concept extraction approaches; (ii) to leverage modern
attribution methods and evaluation metrics to extend and systematically
evaluate state-of-the-art concept-based approaches and importance estimation
techniques; (iii) to derive theoretical guarantees regarding the optimality of
such methods. We further leverage our framework to tackle a crucial
question in explainability: how to efficiently identify clusters of data points
that are classified based on a shared strategy. To illustrate these
findings and to highlight the main strategies of a model, we introduce a visual
representation called the strategic cluster graph. Finally, we present
https://serre-lab.github.io/Lens, a dedicated website that offers a complete
compilation of these visualizations for all classes of the ImageNet dataset.
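A hedged sketch of the two-step recipe this abstract names, using non-negative matrix factorization for concept extraction and a simple linear sensitivity score for importance estimation. The NMF choice, the random stand-in activations, and the read-out-based score are illustrative assumptions; the paper unifies and evaluates a whole family of such estimators.

```python
import numpy as np
from sklearn.decomposition import NMF

# Step 1: concept extraction. Factorize non-negative activations A
# (n_patches x d) as A ~ U @ W, where rows of W are 'concepts' and U
# holds per-patch concept coefficients.
rng = np.random.default_rng(0)
A = np.abs(rng.normal(size=(200, 512)))     # stand-in for ReLU activations
nmf = NMF(n_components=10, init="nndsvd", max_iter=500)
U = nmf.fit_transform(A)                    # concept coefficients, (200, 10)
W = nmf.components_                         # concept bank, (10, 512)

# Step 2: importance estimation. A simple sensitivity score: how much a
# linear read-out's class logit moves per unit of each concept (illustrative;
# the paper studies several estimators and their optimality).
readout = rng.normal(size=512)              # stand-in for class weights
importance = W @ readout                    # one score per concept
print(np.argsort(-importance)[:3])          # most important concepts
```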
Conformal Prediction for Trustworthy Detection of Railway Signals
We present an application of conformal prediction, a form of uncertainty quantification with guarantees, to the detection of railway signals. State-of-the-art architectures are tested and the most promising one undergoes the process of conformalization, where a correction is applied to the predicted bounding boxes (i.e., to their height and width) such that they comply with a predefined probability of success. We work with a novel exploratory dataset of images taken from the perspective of a train operator, as a first step towards building and validating future trustworthy machine learning models for the detection of railway signals.
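The correction step described here (complementing the risk-control sketch given earlier for the companion paper) admits a compact split-conformal reading: pad every predicted width and height by the calibration-set quantile of how far predictions fall short of the ground truth. The additive residual score and all variable names are assumptions, not the paper's exact conformalization.

```python
import numpy as np

def conformal_margin(pred_wh, true_wh, alpha=0.1):
    """Split conformal: margin to add to predicted width/height so that,
    with probability >= 1 - alpha, the corrected box is at least as large
    as the ground truth along each axis (additive residual score is an
    illustrative choice).
    """
    scores = np.max(true_wh - pred_wh, axis=1)   # worst shortfall per box
    n = len(scores)
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level, method="higher")
    return max(q, 0.0)

# usage on synthetic calibration boxes
rng = np.random.default_rng(0)
true_wh = rng.uniform(20, 200, size=(300, 2))
pred_wh = true_wh + rng.normal(0, 5, size=(300, 2))  # noisy detector output
m = conformal_margin(pred_wh, true_wh, alpha=0.1)
corrected = pred_wh + m          # widen and heighten every predicted box
```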